Balanced-MixUp for Highly Imbalanced Medical Image Classification

Abstract

Highly imbalanced datasets are ubiquitous in medical image classification problems. In such problems, it is often the case that rare classes associated with less prevalent diseases are severely under-represented in labeled databases, typically resulting in poor performance of machine learning algorithms due to overfitting in the learning process. In this paper, we propose a novel mechanism for sampling training data based on the popular MixUp regularization technique, which we refer to as Balanced-MixUp. In short, Balanced-MixUp simultaneously performs regular (i.e., instance-based) and balanced (i.e., class-based) sampling of the training data. The resulting two sets of samples are then mixed-up to create a more balanced training distribution from which a neural network can effectively learn without heavily under-fitting the minority classes. We experiment with a highly imbalanced dataset of retinal images (55K samples, 5 classes) and a long-tail dataset of gastro-intestinal video frames (10K images, 23 classes), using two CNNs of varying representation capabilities. Experimental results demonstrate that applying Balanced-MixUp outperforms other conventional sampling schemes and loss functions specifically designed to deal with imbalanced data. Code is released at https://github.com/agaldran/balanced_mixup .
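The sampling-and-mixing idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' released implementation: the function names (`class_balanced_indices`, `balanced_mixup_batch`), the `alpha` default, and the choice to draw the mixing coefficient from a Beta(alpha, 1) distribution are assumptions made here for illustration only.

```python
import numpy as np


def class_balanced_indices(labels, size, rng):
    """Class-based sampling: pick a class uniformly, then one of its samples uniformly."""
    classes = np.unique(labels)
    chosen = rng.choice(classes, size=size)
    return np.array([rng.choice(np.flatnonzero(labels == c)) for c in chosen])


def balanced_mixup_batch(images, labels, num_classes, batch_size, alpha=0.2, rng=None):
    """Mix an instance-sampled batch with a class-balanced batch (hypothetical helper).

    images: array of shape (N, ...); labels: integer array of shape (N,).
    Returns mixed images and soft (mixed one-hot) labels.
    """
    rng = np.random.default_rng() if rng is None else rng

    # Instance-based sampling: every training sample is equally likely,
    # so the batch follows the original (imbalanced) class distribution.
    idx_inst = rng.integers(0, len(labels), size=batch_size)
    # Class-based sampling: every class is equally likely.
    idx_bal = class_balanced_indices(labels, batch_size, rng)

    # Assumption: one mixing coefficient per pair, drawn from Beta(alpha, 1).
    lam = rng.beta(alpha, 1.0, size=batch_size)
    lam_x = lam.reshape(-1, *([1] * (images.ndim - 1)))  # broadcast over image dims

    onehot = np.eye(num_classes)
    x = lam_x * images[idx_inst] + (1.0 - lam_x) * images[idx_bal]
    y = lam[:, None] * onehot[labels[idx_inst]] + (1.0 - lam[:, None]) * onehot[labels[idx_bal]]
    return x, y
```

A small `alpha` keeps most mixing coefficients close to zero under this parameterization, so the balanced sample dominates most mixed pairs; larger values shift weight toward the instance-sampled batch. Which value works best is an empirical question that this sketch does not settle.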

Similar Resources

Predictive Data Mining for Highly Imbalanced Classification

The paper addresses some theoretical and practical aspects of data mining, focusing on predictive data mining, where two central types of prediction problems are discussed: classification and regression. Further emphasis is placed on predictive data mining in which time-stamped data greatly increase the dimensionality and complexity of problem solving. The main goal is through processing of data (rec...

An Effective Approach for Imbalanced Classification: Unevenly Balanced Bagging

Learning from imbalanced data is an important problem in data mining research. Much research has addressed the problem of imbalanced data by using sampling methods to generate an equally balanced training set to improve the performance of the prediction models, but it is unclear what ratio of class distribution is best for training a prediction model. Bagging is one of the most popular and effe...

A Prediction for Classification of Highly Imbalanced Medical Dataset Using Databoost.IM with SVM

Recently, class imbalance problems have attracted growing interest because of the classification difficulty caused by imbalanced class distributions. In particular, many ensemble learning and machine learning methods have been proposed for classifying imbalanced data. However, these methods produce poor predictive accuracy on two-class imbalanced datasets. In this paper,...

On Mining Fuzzy Classification Rules for Imbalanced Data

A fuzzy rule-based classification system (FRBCS) is a popular machine learning technique for classification purposes. One of the major issues when applying it to imbalanced data sets is its bias toward the majority class, such that it performs poorly on the minority class. However, in many cases the minority classes are more important than the majority ones. In this paper, we have extended ...

Classification of Imbalanced Marketing Data with Balanced Random Sets

With imbalanced data, a classifier built using all of the data tends to ignore the minority class. To overcome this problem, we propose to use an ensemble classifier constructed on the basis of a large number of relatively small and balanced subsets, where representatives from both patterns are selected randomly. As an outcome, the system produces the matrix of linear regressio...

Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-87240-3_31